Goto



Distributionally Robust Federated Averaging

Neural Information Processing Systems

In this paper, we study communication-efficient distributed algorithms for distributionally robust federated learning via periodic averaging with adaptive sampling. In contrast to standard empirical risk minimization, due to the minimax structure of the underlying optimization problem, a key difficulty arises from the fact that the global parameter that controls the mixture of local losses can only be updated infrequently on the global stage. To compensate for this, we propose a Distributionally Robust Federated Averaging (DRFA) algorithm that employs a novel snapshotting scheme to approximate the accumulation of historical gradients of the mixing parameter. We analyze the convergence rate of DRFA in both convex-linear and nonconvex-linear settings. We also generalize the proposed idea to objectives with regularization on the mixture parameter and propose a proximal variant, dubbed DRFA-Prox, with provable convergence rates.
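The underlying objective is the distributionally robust formulation min_w max_{λ∈Δ} Σ_i λ_i f_i(w), where λ mixes the local losses of the clients. Below is a minimal Python sketch of one communication round in the spirit of the abstract, not the authors' implementation: the client interface (.loss/.grad), the sampling fraction, and the step sizes eta and gamma are illustrative assumptions. Clients run τ local SGD steps on w, the server averages the local iterates, and λ is updated once per round from losses evaluated at a randomly snapshotted averaged iterate, scaled by τ to approximate the accumulated history of λ-gradients.

    import numpy as np

    def project_simplex(v):
        # Euclidean projection onto the probability simplex.
        u = np.sort(v)[::-1]
        css = np.cumsum(u)
        idx = np.arange(1, len(v) + 1)
        rho = np.nonzero(u * idx > css - 1.0)[0][-1]
        theta = (css[rho] - 1.0) / (rho + 1.0)
        return np.maximum(v - theta, 0.0)

    def drfa_round(w, lam, clients, tau, eta, gamma, rng):
        # One communication round of a DRFA-style scheme (sketch).
        # `clients` is a list of objects exposing .loss(w) and .grad(w),
        # an assumed interface for illustration only.
        m = len(clients)
        sampled = rng.choice(m, size=max(1, m // 2), p=lam, replace=False)
        t_snap = rng.integers(tau)            # random local step to snapshot
        local_ws, snapshots = [], []
        for i in sampled:
            wi = w.copy()
            for t in range(tau):              # tau local SGD steps on w
                wi = wi - eta * clients[i].grad(wi)
                if t == t_snap:
                    snapshots.append(wi.copy())
            local_ws.append(wi)
        w_new = np.mean(local_ws, axis=0)     # periodic averaging of iterates
        w_snap = np.mean(snapshots, axis=0)   # snapshotted averaged iterate
        # The gradient of the objective w.r.t. lam is the vector of local
        # losses; scaling by tau stands in for the tau per-step lam updates
        # that were skipped between communication rounds.
        losses = np.array([c.loss(w_snap) for c in clients])
        lam_new = project_simplex(lam + gamma * tau * losses)
        return w_new, lam_new

In this sketch the λ-gradient is simply the vector of per-client losses, which is why a single evaluation at the snapshot can approximate the τ updates λ never received between rounds.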



Review for NeurIPS paper: Distributionally Robust Federated Averaging

Neural Information Processing Systems

Additional Feedback: I have read the response and the other reviews. Regarding the concern about the experimental evaluation, the response amounts mostly to "[27] did it too", which is precisely the concern I flagged at the end of my review. The failure to meet a simple baseline is left unaddressed. Moreover, [27] also experiments in a more realistic setup, where the simple baseline would not hold; this work does not reproduce that setup, and the response does not mention it. So I consider this concern unaddressed.

